Search for: All records where Creators/Authors contains: "Zilles, Craig"


  1. Background: Previous work has shown that students can understand more complicated pieces of code with the help of common software development tools (code execution, debuggers) than they can without them. Objectives: Given that tools can enable novice programmers to understand more complex code, we believe that students should be explicitly taught to use them, to facilitate their plan acquisition and development as independent programmers. To that end, this paper seeks to understand: (1) the relative utility of these tools, (2) the thought process students use to choose a tool, and (3) the degree to which students can choose an appropriate tool to understand a given piece of code. Method: We used a mixed-methods approach. To explore the relative effectiveness of the tools, we used a randomized controlled trial (N = 421) to observe student performance with each tool on a range of different code snippets. To explore tool selection, we conducted a series of think-aloud interviews (N = 18) in which students were presented with a range of code snippets to understand and were allowed to choose which tool they wanted to use. Findings: Overall, novices were more often successful at comprehending code when given access to code execution, perhaps because it made testing a larger set of inputs easier than the debugger did. As code complexity increased (as indicated by cyclomatic complexity), students became more successful with the debugger. We found that novices preferred code execution for simpler or familiar code, to quickly verify their understanding, and used the debugger on more complex or unfamiliar code or when they were confused about a small subset of the code. High-performing novices were adept at switching between tools, alternating between a detail-oriented and a broader perspective of the code when necessary. Novices who were unsuccessful tended to be overconfident in their incorrect understanding or were unwilling to double-check their answers using a debugger. Implications: We can likely teach novices to independently understand code they do not recognize by using code execution and debuggers. Instructors should teach students to recognize when code is complex (e.g., a large number of nested loops) and to carefully step through such loops with a debugger. We should additionally teach students to double-check their understanding of the code and to self-assess whether they are familiar with it. They can also be encouraged to strategically switch between execution and debuggers to manage cognitive load, thus maximizing their problem-solving capabilities.
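    The abstract above uses cyclomatic complexity as its measure of code complexity. As a rough illustration only (not the paper's instrument), the following Python sketch estimates McCabe cyclomatic complexity for a snippet by counting branch points in its syntax tree; the helper name and the example snippet are ours.

        import ast
        import textwrap

        # Branch-introducing nodes counted by the classic McCabe heuristic:
        # complexity = 1 + number of decision points.
        _BRANCH_NODES = (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler)

        def cyclomatic_complexity(source: str) -> int:
            """Approximate cyclomatic complexity of a Python snippet."""
            complexity = 1
            for node in ast.walk(ast.parse(source)):
                if isinstance(node, ast.BoolOp):
                    # 'a and b and c' adds len(values) - 1 extra paths
                    complexity += len(node.values) - 1
                elif isinstance(node, _BRANCH_NODES):
                    complexity += 1
            return complexity

        snippet = textwrap.dedent("""
            def count_pairs(xs):
                total = 0
                for i in range(len(xs)):
                    for j in range(i + 1, len(xs)):
                        if xs[i] == xs[j]:
                            total += 1
                return total
        """)
        print(cyclomatic_complexity(snippet))  # two nested loops + one branch -> 4
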
  2. To address the challenges of running exams in large enrollment CS courses, we set up and operated an in-person testing center at a minority serving institution. We have run the testing center for two quarters, proctoring over 6,000 exams for eight CS courses with approximately 1,800 students. In this experience report, we discuss the motivation for the testing center, its set-up and operation, and the lessons that we have learned from our first two quarters of operation. In addition, we present student and instructor feedback regarding use of the testing center, future steps, and improvements. By sharing, we hope that other schools can learn from our experience and improve upon our methods to help establish best practices for testing center configuration and operation. 
  3. Battestilli, Lina; Rebelsky, Samuel A; Shoop, Libby (Ed.)
    We compare the exam security of three proctoring regimens for Bring-Your-Own-Device (BYOD), synchronous, computer-based exams in a computer science class: online un-proctored, online proctored via Zoom, and in-person proctored. We performed two randomized crossover experiments to compare these proctoring regimens. The first study measured the score advantage students receive when taking un-proctored online exams rather than Zoom-proctored online exams. The second study measured the score advantage of students taking Zoom-proctored online exams over in-person proctored exams. In both studies, students took six 50-minute exams using their own devices, each of which included two coding questions and 8–10 non-coding questions. We find that students score 2.3% higher on non-coding questions when taking exams in the un-proctored format compared to Zoom proctoring. No statistically significant advantage was found for the coding questions. While most of the non-coding questions were randomized so that students received different versions, for the few questions where all students received the exact same version, the score advantage rose to 5.2%. From the second study, we find no statistically significant difference between students' performance on Zoom-proctored and in-person proctored exams. Given these results, we recommend that educators incorporate some form of proctoring along with question randomization to mitigate cheating concerns in BYOD exams.
  4. The ability to "Explain in Plain English" (EiPE) the purpose of code is a critical skill for students in introductory programming courses to develop. EiPE questions serve as a mechanism for students both to develop and to demonstrate code comprehension skills. However, evaluating this skill has been challenging, as manual grading is time consuming and not easily automated. The process of constructing a prompt for code generation with a Large Language Model, such as OpenAI's GPT-4, bears a striking resemblance to constructing EiPE responses. In this paper, we explore the potential of using test cases run on code generated by GPT-4 from students' EiPE responses as a grading mechanism for EiPE questions. We applied this proposed grading method to a corpus of EiPE responses collected from past exams, then measured agreement between the results of this grading method and human graders. Overall, we find moderate agreement between the human raters and the results of the unit tests run on the generated code. The disagreement appears to be largely attributable to GPT-4's code generation being more lenient than human graders on low-level descriptions of code.
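    As a rough sketch of the grading pipeline described above (not the authors' implementation), the code below stands in a canned string for the GPT-4 call, executes the generated code, and marks the EiPE response correct only if that code passes the instructor's test cases; the function names, the stubbed model output, and the test cases are illustrative assumptions.

        def generate_code_from_description(description: str) -> str:
            """Placeholder for a GPT-4 call that turns an EiPE response into code.

            A real implementation would prompt the model with `description` and
            request a function with a fixed name and signature.
            """
            return (
                "def mystery(xs):\n"
                "    return max(xs)\n"
            )

        def grade_eipe_response(description: str, test_cases) -> bool:
            """Return True iff code generated from the description passes all tests."""
            namespace = {}
            try:
                exec(generate_code_from_description(description), namespace)
                func = namespace["mystery"]
                return all(func(args) == expected for args, expected in test_cases)
            except Exception:
                return False  # code that fails to run or crashes is marked incorrect

        tests = [([3, 1, 2], 3), ([5], 5), ([-2, -7], -2)]
        print(grade_eipe_response("returns the largest value in the list", tests))  # True
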
  5. Background and context. "Explain in Plain English" (EiPE) questions ask students to explain the high-level purpose of code, requiring them to understand the macrostructure of the program's intent. A lot is known about the techniques that experts use to comprehend code, but less is known about how we should teach novices to develop this capability. Objective. Identify techniques that can be taught to students to assist them in developing their ability to comprehend code, and contribute to the body of knowledge of how novices develop their code comprehension skills. Method. Motivated by previous research on how experts comprehend code, we developed interventions that could be taught to novices: prompting students to identify beacons, to identify the roles of variables, to trace, and to trace abstractly. We conducted think-aloud interviews of introductory programming students solving EiPE questions, varying which interventions each student was taught. Some participants were interviewed multiple times throughout the semester to observe any changes in behavior over time. Findings. Identifying beacons and naming variable roles were rarely helpful, as they did not encourage students to integrate their understanding of that piece of code in relation to other lines. However, prompting students to explain each variable's purpose helped them focus on useful subsets of the code, which helped manage cognitive load. Tracing was helpful when students incorrectly recognized common programming patterns or made mistakes comprehending syntax (text-surface). Prompting students to pick inputs that potentially contradicted their current understanding of the code proved to be a simple way to help them select effective inputs to trace. Abstract tracing helped students see high-level, functional relationships between variables. In addition, we observed students spontaneously sketching algorithmic visualizations that similarly helped them see relationships between variables. Implications. Because students can get stuck at many points in the process of code comprehension, there seems to be no silver-bullet technique that helps in every circumstance. Instead, effective instruction for code comprehension will likely involve teaching a collection of techniques. In addition to these techniques, meta-knowledge about when to apply each one will need to be learned, but that is left for future research. At present, we recommend teaching a bottom-up, concrete-to-abstract approach.
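    To make concrete the intervention of prompting students to explain each variable's purpose, here is an illustrative snippet of our own (not from the study), annotated the way a student might be prompted to annotate it; the code and the comments are ours.

        def longest_run(values, target):
            best = 0       # purpose: length of the longest run of `target` seen so far
            current = 0    # purpose: length of the run currently being scanned
            for v in values:           # purpose: the element under inspection
                if v == target:
                    current += 1
                    best = max(best, current)
                else:
                    current = 0
            return best

        print(longest_run([1, 2, 2, 2, 3, 2, 2], 2))  # 3
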
  6. Fisler, Kathi; Denny, Paul; Franklin, Diana; Hamilton, Margaret (Ed.)
    Background: Prior work has primarily been concerned with identifying: (1) how Open Education Resources (OERs) can be used to increase the availability of educational materials, (2) what motivations are behind their adoption and usage in classrooms, and (3) what barriers impede said adoption. However, there is relatively little work investigating the motives for and barriers to contributing to OERs. Objectives: Our goal is to understand what motivates instructors to contribute to and adopt OERs, and what dissuades them from doing so. Additionally, we wish to know what would increase the likelihood of instructors contributing their work to OER repositories. Method: We conduct a 10-question survey of computing instructors about OER, with a heavy emphasis on what would lead to OER contributions. Using thematic analysis, we mine the broad themes from our respondents' answers and group them into broader topical areas. Findings: Novel contributions include discussions of what faculty are less willing to share (in particular, exam questions, due to concerns about possible student cheating), as well as discussions of different views on monetary and non-monetary (e.g., promotion and tenure value) incentives for contributing to OER efforts. With respect to the kinds of OER faculty want to use, our findings line up with prior literature. Implications: As course materials become more sophisticated and the range of topics taught in computing continues to grow, the communal effort required to maintain a broad collection of high-quality OERs also grows. Understanding what factors influence instructors to contribute to this effort, and how we can facilitate the contribution, discovery, and use of OERs, is fundamental both to how OER repositories should be organized and to how funding initiatives to support them should be structured.
  7. Explain in Plain English (EiPE) questions evaluate whether students can understand and explain the high-level purpose of code. We conducted a qualitative think-aloud study of introductory programming students solving EiPE questions. In this paper, we focus on how students use tracing (mental execution) to understand code in order to explain it. We found that, in some cases, tracing can be an effective strategy for novices to understand and explain code. Furthermore, we observed three problems that prevented tracing from being helpful: (1) not employing tracing when it could be helpful (some struggling students explained correctly after the interviewer suggested tracing the code), (2) tracing incorrectly due to misunderstandings of the programming language, and (3) tracing with a set of inputs that did not sufficiently expose the code's behavior (when the interviewer suggested inputs, students explained correctly). These results suggest that we should teach students to use tracing as a method for understanding code and teach them how to select appropriate inputs to trace.
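    The third problem above, choosing inputs that do not sufficiently expose the code's behavior, is easy to illustrate with a snippet of our own (not from the study): on an already-sorted list, "returns the last element" and "returns the largest element" are indistinguishable, so an input chosen to contradict the current explanation is needed.

        def mystery(xs):
            # Returns the largest element of xs.
            result = xs[0]
            for x in xs:
                if x > result:
                    result = x
            return result

        print(mystery([1, 2, 3]))  # 3 -- consistent with both explanations
        print(mystery([3, 1, 2]))  # 3 -- rules out "returns the last element"
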
  8. In technical writing, certain statements must be written very carefully in order to clearly and precisely communicate an idea. Students are often asked to write these statements in response to an open-ended prompt, making them difficult to auto-grade with traditional methods. We present what we believe to be a novel approach for auto-grading these statements by restricting students' submissions to a pre-defined context-free grammar (configured by the instructor). In addition, our tool provides instantaneous feedback that helps students improve their writing, and it scaffolds the process of constructing a statement by reducing the number of choices students have to make compared to free-form writing. We evaluated our tool by deploying it on an assignment in an undergraduate algorithms course. The assignment contained five questions that used the tool, preceded by a pre-test and followed by a post-test. We observed a statistically significant improvement from the pre-test to the post-test, with the mean score increasing from 7.2/12 to 9.2/12.
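    To illustrate the idea of restricting submissions to an instructor-configured context-free grammar (a minimal sketch of our own, not the paper's tool), the grammar below enumerates every statement a student could construct, which is also what makes exact auto-grading possible; the grammar, nonterminal names, and sentences are invented for illustration.

        from itertools import product

        # Nonterminals are UPPERCASE keys; everything else is a terminal word.
        GRAMMAR = {
            "STMT":  [["the", "algorithm", "runs", "in", "BOUND", "time", "COND"]],
            "BOUND": [["O(n)"], ["O(n log n)"], ["O(n^2)"]],
            "COND":  [["in", "the", "worst", "case"], ["on", "average"]],
        }

        def expand(symbol, depth=8):
            """Yield every sentence derivable from `symbol` within `depth` expansions."""
            if depth == 0:
                return
            if symbol not in GRAMMAR:        # terminal word
                yield symbol
                return
            for production in GRAMMAR[symbol]:
                # Expand each symbol of the production, then combine the options.
                options = [list(expand(s, depth - 1)) for s in production]
                for combo in product(*options):
                    yield " ".join(combo)

        sentences = set(expand("STMT"))
        print(len(sentences))  # 6 constructible statements
        print("the algorithm runs in O(n log n) time in the worst case" in sentences)  # True
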
  9. Errors in AI grading and feedback are by their nature non-deterministic and difficult to completely avoid. Since inaccurate feedback potentially harms learning, there is a need for designs and workflows that mitigate these harms. To better understand the mechanisms by which erroneous AI feedback impacts students' learning, we conducted surveys and interviews that recorded students' interactions with a short-answer AI autograder for "Explain in Plain English" code reading problems. Using causal modeling, we inferred the learning impacts of wrong answers marked as right (false positives, FPs) and right answers marked as wrong (false negatives, FNs). We further explored explanations for these learning impacts, including how errors influenced participants' engagement with feedback and their assessments of their answers' correctness, as well as participants' prior performance in the class. FPs harmed learning in large part because participants failed to detect the errors: participants did not pay attention to the feedback after being marked right, and they showed an apparent bias against admitting that an answer was wrong once it had been marked right. On the other hand, FNs harmed learning only for survey participants, suggesting that interviewees' greater behavioral and cognitive engagement protected them from learning harms. Based on these findings, we propose ways to help learners detect FPs and to encourage deeper reflection on FNs, to mitigate the learning harms of AI errors.
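    As a purely illustrative aside (the labels and counts below are invented, not the study's data), the two error types discussed above can be computed against human judgments like this:

        # Human graders' judgments vs. the AI autograder's judgments for six answers.
        human = [True, True, False, False, True, False]
        ai    = [True, False, False, True, True, False]

        # False positive: AI marks an answer right that humans judged wrong.
        fp = sum(a and not h for h, a in zip(human, ai))
        # False negative: AI marks an answer wrong that humans judged right.
        fn = sum(h and not a for h, a in zip(human, ai))

        print(f"false positives: {fp}, false negatives: {fn}")  # 1 and 1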